Search results for "String searching algorithm"

showing 10 items of 15 documents

Secure and Privacy Preserving Pattern Matching in Distributed Cloud-based Data Storage

2019

Given two strings: pattern $p$ of length $m$ and text $t$ of length $n$ . The string matching problem is to find all (or some) occurrences of the pattern $p$ in the text $t$ . We introduce a new simple data structure, called index arrays, and design fast privacy-preserving matching algorithm for string matching. The motivation behind introducing index arrays is determined by the need for pattern matching on distributed cloud-based datasets with semi-trusted cloud providers. It is intended to use encrypted index arrays both to improve performance and protect confidentiality and privacy of user data.

021110 strategic defence & security studiesTheoretical computer scienceComputer sciencebusiness.industry0211 other engineering and technologiesCloud computing02 engineering and technologyString searching algorithmData structureEncryptionSimple (abstract algebra)Computer data storagePattern matchingbusinessBlossom algorithm2019 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS)
researchProduct

Efficient Algorithms for Sequence Analysis with Entropic Profiles

2017

Entropy, being closely related to repetitiveness and compressibility, is a widely used information-related measure to assess the degree of predictability of a sequence. Entropic profiles are based on information theory principles, and can be used to study the under-/over-representation of subwords, by also providing information about the scale of conserved DNA regions. Here, we focus on the algorithmic aspects related to entropic profiles. In particular, we propose linear time algorithms for their computation that rely on suffix-based data structures, more specifically on the truncated suffix tree (TST) and on the enhanced suffix array (ESA). We performed an extensive experimental campaign …

0301 basic medicineCompressed suffix arrayTheoretical computer scienceEntropySuffix tree0206 medical engineeringGeneralized suffix tree02 engineering and technologyString searching algorithmInformation theorylaw.invention03 medical and health scienceslawGeneticsAnimalsHumansMathematicsApplied MathematicsSuffix arrayComputational BiologyDNASequence Analysis DNAData structure030104 developmental biologySuffixAlignment free Entropy Sequence analysis Sequence comparisonAlgorithms020602 bioinformaticsBiotechnologyIEEE/ACM Transactions on Computational Biology and Bioinformatics
researchProduct

Effect of Practice, Mapping, Stimulus and Size on String Matching

1987

The same-different discrepancy on a matching task on which the subject had to determine the number of common elements (physically identical and appearing in the same position) between two strings of size 1 to 4 was investigated. Manipulated also were the type of presentation (fixed or varied sets), amount of practice (four blocks), and type of stimulus (letters, words). Reaction times for pure positive responses (all same at each level) were faster than negative responses (all different), confirming the usual discrepancy shown in previous studies. The discrepancy was smaller for well-learned sets (fixed sets) and for words, indicating the development of a comparison process based on global…

AdultMaleCommunicationbusiness.industryExperimental and Cognitive PsychologyString searching algorithmStimulus (physiology)Sensory SystemsDiscrimination LearningCombinatoricsPattern Recognition VisualReadingPractice PsychologicalHumansAttentionFemalebusinessSize PerceptionMathematicsPerceptual and Motor Skills
researchProduct

"Indexing structures for approximate string matching

2003

In this paper we give the first, to our knowledge, structures and corresponding algorithms for approximate indexing, by considering the Hamming distance, having the following properties. i) Their size is linear times a polylog of the size of the text on average. ii) For each pattern x, the time spent by our algorithms for finding the list occ(x) of all occurrences of a pattern x in the text, up to a certain distance, is proportional on average to |x| + |occ(x)|, under an additional but realistic hypothesis.

CombinatoricsCombinatorics on wordsPattern recognition (psychology)Search engine indexingAutomata theoryHamming distanceString searching algorithmApproximate string matchingTime complexityMathematics
researchProduct

Multi-Dimensional Pattern Matching with Dimensional Wildcards: Data Structures and Optimal On-Line Search Algorithms

1997

We introduce a new multidimensional pattern matching problem that is a natural generalization of string matching, a well studied problem1. The motivation for its algorithmic study is mainly theoretical. LetA1:n1,?,1:nd be a text matrix withN=n1?ndentries andB1:m1,?,1:mr be a pattern matrix withM=m1?mrentries, whered?r?1 (the matrix entries are taken from an ordered alphabet ?). We study the problem of checking whether somer-dimensional submatrix ofAis equal toB(i.e., adecisionquery).Acan be preprocessed andBis given on-line. We define a new data structure for preprocessingAand propose CRCW-PRAM algorithms that build it inO(logN) time withN2/nmaxprocessors, wherenmax=max(n1,?,nd), such that …

Control and OptimizationSuffix treeBlock matrixWildcard characterString searching algorithmcomputer.file_formatData structurelaw.inventionCombinatoricsComputational MathematicsMatrix (mathematics)Computational Theory and MathematicsSearch algorithmlawPattern matchingcomputerMathematicsJournal of Algorithms
researchProduct

Periodicity and repetitions in parameterized strings

2008

AbstractOne of the most beautiful and useful notions in the Mathematical Theory of Strings is that of a Period, i.e., an initial piece of a given string that can generate that string by repeating itself at regular intervals. Periods have an elegant mathematical structure and a wealth of applications [F. Mignosi and A. Restivo, Periodicity, Algebraic Combinatorics on Words, in: M. Lothaire (Ed.), Cambridge University Press, Cambridge, pp. 237–274, 2002]. At the hearth of their theory, there are two Periodicity Lemmas: one due to Lyndon and Schutzenberger [The equation aM=bNcP in a free group, Michigan Math. J. 9 (1962) 289–298], referred to as the Weak Version, and the other due to Fine and …

Discrete mathematicsLemma (mathematics)Algebraic combinatoricsCombinatorics on wordsSettore INF/01 - InformaticaApplied MathematicsParameterized complexityParameterized stringsString searching algorithmString (physics)Periodic functionCombinatoricsCombinatorics on wordsDiscrete Mathematics and CombinatoricsString periodicityUniquenessCombinatorics on Words AlgorithmsMathematics
researchProduct

On Table Arrangements, Scrabble Freaks, and Jumbled Pattern Matching

2010

Given a string s, the Parikh vector of s, denoted p(s), counts the multiplicity of each character in s. Searching for a match of Parikh vector q (a “jumbled string”) in the text s requires to find a substring t of s with p(t) = q. The corresponding decision problem is to verify whether at least one such match exists. So, for example for the alphabet Σ = {a, b, c}, the string s = abaccbabaaa has Parikh vector p(s) = (6,3,2), and the Parikh vector q = (2,1,1) appears once in s in position (1,4). Like its more precise counterpart, the renown Exact String Matching, Jumbled Pattern Matching has ubiquitous applications, e.g., string matching with a dyslectic word processor, table rearrangements, …

Discrete mathematicsParikh vectors jumbled pattern matching scrabble approximate pattern matching000AnagramParikh vectorsString searching algorithmApproximate string matchingDecision problemalgorithmsData structureJumbled Pattern MatchingSubstringscrabbleapproximate pattern matchingString MatchingWavelet TreePattern matchingMathematics
researchProduct

On the Power of Non-adaptive Learning Graphs

2012

We introduce a notion of the quantum query complexity of a certificate structure. This is a formalisation of a well-known observation that many quantum query algorithms only require the knowledge of the disposition of possible certificates in the input string, not the precise values therein. Next, we derive a dual formulation of the complexity of a non-adaptive learning graph, and use it to show that non-adaptive learning graphs are tight for all certificate structures. By this, we mean that there exists a function possessing the certificate structure and such that a learning graph gives an optimal quantum query algorithm for it. For a special case of certificate structures generated by cer…

FOS: Computer and information sciencesDiscrete mathematicsQuantum PhysicsTheoretical computer scienceComputational complexity theoryComputer scienceGeneral MathematicsExistential quantificationFOS: Physical sciencesGraph theoryString searching algorithmComputational Complexity (cs.CC)Query optimizationCertificateUpper and lower boundsTheoretical Computer ScienceComputational MathematicsComputer Science - Computational ComplexityComputational Theory and MathematicsBounded functionAdaptive learningSpecial caseQuantum Physics (quant-ph)Quantum computerMathematics2013 IEEE Conference on Computational Complexity
researchProduct

Binary jumbled string matching for highly run-length compressible texts

2012

The Binary Jumbled String Matching problem is defined as: Given a string $s$ over $\{a,b\}$ of length $n$ and a query $(x,y)$, with $x,y$ non-negative integers, decide whether $s$ has a substring $t$ with exactly $x$ $a$'s and $y$ $b$'s. Previous solutions created an index of size O(n) in a pre-processing step, which was then used to answer queries in constant time. The fastest algorithms for construction of this index have running time $O(n^2/\log n)$ [Burcsi et al., FUN 2010; Moosa and Rahman, IPL 2010], or $O(n^2/\log^2 n)$ in the word-RAM model [Moosa and Rahman, JDA 2012]. We propose an index constructed directly from the run-length encoding of $s$. The construction time of our index i…

FOS: Computer and information sciencesString algorithmsStructure (category theory)Binary numberG.2.1Data_CODINGANDINFORMATIONTHEORY0102 computer and information sciences02 engineering and technologyString searching algorithm01 natural sciencesComputer Science - Information RetrievalTheoretical Computer ScienceCombinatoricsdata structuresSimple (abstract algebra)Computer Science - Data Structures and AlgorithmsString algorithms; jumbled pattern matching; prefix normal form; data structures0202 electrical engineering electronic engineering information engineeringParikh vectorData Structures and Algorithms (cs.DS)Run-length encodingMathematics68W32 68P05 68P20String (computer science)prefix normal formSubstringComputer Science Applicationsjumbled pattern matching010201 computation theory & mathematicsData structureSignal ProcessingRun-length encoding020201 artificial intelligence & image processingConstant (mathematics)Information Retrieval (cs.IR)Information SystemsInformation Processing Letters
researchProduct

From Nerode's congruence to Suffix Automata with mismatches

2009

AbstractIn this paper we focus on the minimal deterministic finite automaton Sk that recognizes the set of suffixes of a word w up to k errors. As first result we give a characterization of the Nerode’s right-invariant congruence that is associated with Sk. This result generalizes the classical characterization described in [A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. Chen, J. Seiferas, The smallest automaton recognizing the subwords of a text, Theoretical Computer Science, 40, 1985, 31–55]. As second result we present an algorithm that makes use of Sk to accept in an efficient way the language of all suffixes of w up to k errors in every window of size r of a text, where r is the…

General Computer ScienceOpen problem[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]0102 computer and information sciences02 engineering and technologyString searching algorithm01 natural sciencesTheoretical Computer ScienceCombinatoricsDeterministic automatonSuffix automata0202 electrical engineering electronic engineering information engineeringCombinatorics on words Indexing Suffix Automata Languages with mismatches Approximate string matchingMathematicsDiscrete mathematicsCombinatorics on wordsApproximate string matchingSettore INF/01 - InformaticaLanguages with mismatchesComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)PrefixCombinatorics on wordsDeterministic finite automaton010201 computation theory & mathematicsSuffix automatonIndexing020201 artificial intelligence & image processingSuffixComputer Science::Formal Languages and Automata TheoryComputer Science(all)
researchProduct